
MSB: A mean-shift-based approach for the analysis of structural variation in the genome

Abstract

Genome structural variation includes segmental duplications, deletions, and other rearrangements, and array-based comparative genomic hybridization (array-CGH) is a popular technology for determining this. Drawing relevant conclusions from array-CGH requires computational methods for partitioning the chromosome into segments of elevated, reduced, or unchanged copy number. Several approaches have been described, most of which attempt to explicitly model the underlying distribution of data based on particular assumptions. Often, they optimize likelihood functions for estimating model parameters, by expectation maximization or related approaches; however, this requires good parameter initialization through prespecifying the number of segments. Moreover, convergence is difficult to achieve, since many parameters are required to characterize an experiment. To overcome these limitations, we propose a nonparametric method without a global criterion to be optimized. Our method involves mean-shift-based (MSB) procedures; it considers the observed array-CGH signal as sampling from a probability density function, uses a kernel-based approach to estimate local gradients for this function, and iteratively follows them to determine local modes of the signal. Overall, our method achieves robust discontinuity-preserving smoothing, thus accurately segmenting chromosomes into regions of duplication and deletion. It does not require the number of segments as input, nor does its convergence depend on this. We successfully applied our method to both simulated data and array-CGH experiments on glioblastoma and adenocarcinoma. We show that it performs at least as well as, and often better than, 10 previously published algorithms. Finally, we show that our approach can be extended to segmenting the signal resulting from the depth-of-coverage of mapped reads from next-generation sequencing.
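
The mean-shift idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' MSB implementation; it is a generic mean-shift filter over a one-dimensional signal, assuming a Gaussian kernel in the joint domain of probe position and signal value. The function names (`mean_shift_smooth`, `find_breakpoints`), the bandwidths `h_space` and `h_range`, the `min_jump` threshold, and the synthetic test signal are all hypothetical choices made for illustration.

```python
import math


def mean_shift_smooth(values, h_space=5.0, h_range=0.3, tol=1e-4, max_iter=100):
    """Discontinuity-preserving smoothing of a 1D signal by mean shift.

    Each probe starts at (position, value) and repeatedly moves to the
    kernel-weighted mean of all probes, which follows the estimated density
    gradient uphill to a local mode. Bandwidths are assumed values.
    """
    smoothed = []
    for i, start in enumerate(values):
        x, y = float(i), float(start)
        for _ in range(max_iter):
            w_sum = x_sum = y_sum = 0.0
            for j, yj in enumerate(values):
                # Gaussian weight in the joint (position, value) domain.
                dx = (x - j) / h_space
                dy = (y - yj) / h_range
                w = math.exp(-0.5 * (dx * dx + dy * dy))
                w_sum += w
                x_sum += w * j
                y_sum += w * yj
            new_x, new_y = x_sum / w_sum, y_sum / w_sum
            shift = math.hypot(new_x - x, new_y - y)
            x, y = new_x, new_y
            if shift < tol:  # converged to a local mode
                break
        smoothed.append(y)  # the mode's value replaces the noisy value
    return smoothed


def find_breakpoints(smoothed, min_jump=0.5):
    """Return indices where the smoothed signal jumps by more than min_jump."""
    return [i for i in range(1, len(smoothed))
            if abs(smoothed[i] - smoothed[i - 1]) > min_jump]


# Synthetic example: 30 probes near 0.0 followed by 30 probes near 1.0,
# with a small deterministic perturbation standing in for noise.
signal = [(0.0 if i < 30 else 1.0) + 0.05 * math.sin(1.7 * i) for i in range(60)]
smoothed = mean_shift_smooth(signal)
breaks = find_breakpoints(smoothed)
print(breaks)
```

Because probes whose values differ by much more than `h_range` get almost no weight, averaging does not blur across the step, and no segment count has to be supplied in advance. Real array-CGH data would need bandwidths chosen for the noise level and probe spacing.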